
The present invention relates to a method for determination of a
parameter of a system generating a signal containing information
about the parameter.

The method may be used for identification of sound or speech
signals, such as in speech recognition; or for quality measurement
of audio products or systems, such as loudspeakers, hearing aids,
telecommunication systems, or for quality measurement of acoustic
conditions. The method of the present invention may also be used in
connection with speech compression and decompression in narrow band
telecommunication.

The method may also be used in analysis of mechanical vibrations
generated by a manufactured device during operation, e.g. for
detection of malfunction of the device.

The method may further be used in electrobiology, for example for
analysis of neuroelectrical signals such as signals from an
electroencephalograph, an electromyograph etc.

BACKGROUND OF THE INVENTION 

Prior art methods of signal processing are based on a short time
Fourier transform of signals and it is assumed that the signals are
steady state signals.

In steady state analysis the signal is assumed stationary in the
period the signal is analysed and the steady state spectrum is
calculated.

In real life steady state signals do not occur and steady state
analysis does not provide sufficient knowledge of phenomena within
various scientific and technological fields. Consider for example
speech analysis. The human ear has the ability to simultaneously
catch fast sound signals, detect sound frequencies with great
accuracy and differentiate between sound signals in complicated
sound environments. For instance it is possible to understand what
a singer is singing in an accompaniment of musical instruments.



It is assumed that the cochlea in the human ear can be regarded as
comprising a large number of band-pass filters within the frequency
range of the human ear.

The time response $f(t)$ of one band-pass filter to an excitation
can be separated into two components, the transient response
$f_t(t)$ and the steady state response $f_s(t)$:

$$f(t) = f_t(t) + f_s(t)$$

Traditional signal processing is based on the steady state response
$f_s(t)$, and the transient response $f_t(t)$ is assumed to vanish
very fast and to be without importance for the perception, see for
example "Principles of Circuit Synthesis", McGraw-Hill 1959, Ernest
S. Kuh and Donald O. Pederson, page 12, lines 9-15, where it is
stated that:



"only the forced response is considered while the response due to 
the initial state of the network is ignored" . 

Thus, when students are introduced to the world of signal analysis,
they learn that the transient response, i.e. the response due to
the initial state of the network, should be ignored because it
vanishes within a very short period of time. Furthermore, it is
rather difficult to analyse these transient signals by use of
traditional linear methods of analysis.

The ability of the human ear to hear very short sounds and at the
same time detect frequencies with great accuracy is in conflict
with traditional filter-based spectrum analysis. The time window
$t_w$ (twice the rise time) of a band-pass filter is inversely
proportional to the bandwidth,

$$t_w = \frac{2}{f_u - f_l},$$

where $f_l$ is the lower cut-off frequency and $f_u$ is the upper
cut-off frequency.

Thus, if a rise time of 5 ms is required the consequence is that 
the frequency resolution is no better than 400 Hz. 
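The relation between time window and bandwidth can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the 5 ms figure is taken here as the time window $t_w$ in the relation above, which is the reading that reproduces the 400 Hz resolution quoted in the text.

```python
def time_window_s(f_lower_hz: float, f_upper_hz: float) -> float:
    """Time window t_w = 2 / (f_u - f_l) of a band-pass filter, in seconds."""
    bandwidth_hz = f_upper_hz - f_lower_hz
    if bandwidth_hz <= 0:
        raise ValueError("upper cut-off must exceed lower cut-off")
    return 2.0 / bandwidth_hz


def bandwidth_for_window_hz(t_w_s: float) -> float:
    """Bandwidth f_u - f_l implied by a time window t_w, in Hz."""
    if t_w_s <= 0:
        raise ValueError("time window must be positive")
    return 2.0 / t_w_s


# Assumption: 5 ms is used as the time window t_w.
print(bandwidth_for_window_hz(0.005))
# A 400 Hz wide filter (illustrative cut-offs) gives the same window back.
print(time_window_s(1000.0, 1400.0))
```

Halving the window doubles the required bandwidth, which is the trade-off between time resolution and frequency resolution that the text describes.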

As the detection of these transients is in conflict with a high
frequency resolution, the detection by the human ear of these
transients must take place in an alternative manner. It has not
been examined how the human ear is able to detect these signals,
but it might be possible that the cochlea, when no sounds are
received, is in a position of rest, where the cochlea will be very
broad-banded. When a sound signal is received, the cochlea may
start to lock itself to the frequency component or components
within the signal. Thus, the cochlea may be broad-banded in its
starting position, but if one or more stable frequencies are
received the cochlea may lock itself to this frequency or these
frequencies with a high accuracy.

Today it is known that the nerve pulses launched from the cochlea
are synchronized to the frequency of a tone if the frequency is
less than about 1.4 kHz. If the frequency is higher than 1.4 kHz
the pulses are launched randomly and less than once per cycle of
the frequency.

Signal processing based on filter bank spectrum analysis is
disclosed in GB 2213623 which describes a system for phoneme
recognition. This system comprises detecting means for detecting
transient parts of a voice signal, where the principal object of
the transient detection is the detection of a point where the
speech spectrum varies most sharply, namely, a peak point. The
detection of the peak points is used for more precise phoneme
segmentation. The transient analysis of GB 2213623 is based on a
spectrum analysis and the change in the spectrum, which is very
much different from the transient analysis of the present invention
which is based on a direct transient detection in the time domain.

SUMMARY OF THE INVENTION

The present invention provides an approach which is different in
principle from all known methods for processing signals. The
approach taken and some of the results obtained will be explained
by way of an example in the context of analysis of speech signals.

Speech is produced by means of short pulses generated by the vocal
cords in the case of voiced speech and by friction in the vocal
tract in the case of unvoiced speech. The pulses are filtered by
the vocal tract that acts as a time-varying filter. The output
response will consist of quasi steady state terms and also
transient terms. The quasi steady state terms will only be damped
slightly in the period before the next pulse is generated. The
transient terms will be sufficiently damped in the time period
before the next pulse is generated.

The speech signal is often assumed to have only quasi steady state
terms in the period or time window of the analysis, typically 20-30
ms.

The placement of formants (energy bands in the short time power
spectrum), calculated by means of a short time spectrum analysis,
has previously been assumed decisive for speech intelligibility,
together with voiced/unvoiced detection, the pitch and the quasi
steady state power.

However, a number of observations made within the field of auditory
perception research do not conform to the previous assumptions:

Why is it possible to understand and identify a deep male voice
through communication channels that have a higher cut-off frequency
than the male pitch?

The only difference between the pronunciation of the letters e, b,
d is in the first 1-3 ms of the voice signal and this information
will be lost if the analysis has a time window of 20-30 ms.

How can the absolute placement of these formants be decisive when
their placement is quite different for different people,
particularly between small children and large males?

Why is distortion dominated by odd order harmonics and caused by
cross-over distortion in a class B amplifier much more disturbing
than distortion dominated by even order harmonics caused by
amplitude distortion in a class A amplifier?

The short time power spectrum will not distinguish frequencies from
different sources, and tones generated by other sources than the
speech signal will act like false formants.

Why does a signal consisting of three tones with the same
frequencies as the formants for a vowel not give the slightest
perception of the vowel at all? The signal just sounds like three
separate tones.

Why is the ear very sensitive to frequency changes of a signal up
to about 1000 Hz, where changes of +/- 3 Hz can be detected? For
frequencies above 1000 Hz, the sensitivity is much smaller.

The research performed by the present applicant suggests that the
ear is tone dominant until about 1.4 - 1.6 kHz and transient
dominant above. Tone dominant means that the pulses launched from
the hair cells as a response to a tone signal are synchronised to
the tone signal. Transient dominant means, in the present context,
that the hair cells are activated by changes of the energy with
rise and fall times of at most 2 ms, typically caused by transient
pulses.

Regarding speech signals, it is assumed that the quasi steady state
terms are in the tone dominant interval of the ear and that the
transient terms are in the transient dominant interval. It is
believed that the transient terms are very important for speech
intelligibility. The transient terms are seen as transient pulses
in the speech signal. The rise time and the shape of leading and
lagging edges of the envelope of transient pulses in terms of a
profile of damped frequencies describe the sound picture. The
shape of the leading and lagging edges, the dynamic changes (change
of amplitude) of the transient pulses, voiced/unvoiced detection
and the changes of pitch are decisive for speech recognition.

This approach provides a number of advantages with respect to 
explaining the earlier mentioned speech perception observations. 


A natural explanation as to why it is possible to understand and 
identify a deep male voice through communication channels that have 
a higher cut-off frequency than the male pitch is provided. The 
pitch can be detected as the period between transient pulses. 
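As an illustration of reading the pitch from the spacing of transient pulses, the following sketch takes a list of pulse onset times and returns the reciprocal of the typical interval. The function name, the use of the median, and the example times are assumptions made for illustration; the text does not prescribe a particular estimator.

```python
from statistics import median


def pitch_from_pulse_times(pulse_times_s):
    """Estimate pitch (Hz) as the reciprocal of the median interval between pulses."""
    if len(pulse_times_s) < 2:
        raise ValueError("need at least two pulse times")
    intervals = [b - a for a, b in zip(pulse_times_s, pulse_times_s[1:])]
    return 1.0 / median(intervals)


# Hypothetical onsets of transient pulses spaced 10 ms apart (a deep voice).
pulse_times = [0.000, 0.010, 0.020, 0.030, 0.040]
print(pitch_from_pulse_times(pulse_times))
```

In this example the pulses may be observed in a band well above the pitch itself, yet their 10 ms spacing still yields a pitch of about 100 Hz.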

The absolute placement of formants is not decisive. The damped
frequencies profile of the shape of the transient pulse envelope is
dominated by damped difference frequencies of the transient terms.

Distortion caused by cross-over distortion in a class B amplifier
generates abrupt energy changes (unwanted transients) which are
much more disturbing than distortion caused by amplitude distortion
in a class A amplifier, which does not generate the same abrupt
energy changes.

Robust data- or telecommunication is based on modulation. The
envelope of transient pulses is a kind of amplitude modulation,
transient or impulse response modulation, and will have the same
advantages.

It is unlikely that frequencies from other sources will cause
interference patterns with the speech signal that give energy
changes with time constants and shapes in the range that is
decisive for speech intelligibility. This means that transient
modulation will be robust in noisy environments and communication
channels.

The ear is probably very sensitive to changes of a frequency up
to about 1000 Hz because the nerve pulses are synchronised to the
frequency and the period between the pulses is a measure for the
frequency. In the high frequency range, where the pulses are not
synchronised to the frequency, only placement of the frequency in
the cochlea is a measure for the frequency.

According to the invention it has for example been found that the
signal information relevant to recognition of speech is present in
a transient part of the speech signal. Thus, the method of the
present invention may involve a separation of the transient part of
an auditory signal, a generation of a transient pulse corresponding
to the transient part, and analysis of the shape of the pulse. In
an auditory signal, the corresponding transient pulse may be
repeated with time intervals, and the time interval of these
periodic transient pulses is normally also analysed or determined.

In real life, the human ear reacts to energy changes at high
frequencies in order to recognise phonemes or sound pictures. But
in the present method transient pulses corresponding to the energy
changes observed by the ear are extracted at these high
frequencies, wherefore the transient pulses preferably are
transformed to the low frequency range still maintaining the
distinct features of the sound pictures or phonemes. Thus, by using
the principles of the invention, it is possible to obtain distinct
features within auditory signals by examining the transformed low
frequency signals.

The invention relates to the use of the shape of energy changes of
a signal for identifying or representing features of the system
generating the signal, for example in recognition of sound features
which can be perceived by an animal ear such as a human ear as
representing a distinct sound picture.

The method of the present invention provides an expression for the
transient conditions of the auditory signal. The method comprises a
band-pass filtration of an auditory signal within the frequency
range of the human ear and a detection of a low-pass filtered
envelope, which envelope then can be analysed with known methods of
signal analysis. The envelope is an expression of the transient
part of the signal.




The method of signal analysis, which should be used when analysing
the envelope, and the characteristics of the band-pass filter,
which should be selected, will depend on the purpose of the
analysis. The purpose may be speech recognition, quality
measurement of audio products or acoustic conditions, and narrow
band telecommunication.

The invention also relates to a system for processing a signal to
reduce the bandwidth of the signal with substantial retention of
the information of the signal. The system may further comprise
means for extracting the transient component of the auditory
signal, and it may comprise means for detecting an envelope of the
transient component.



A signal may be separated into a sum of impulse responses generated
by poles and zeroes in the system that has generated the signal, if
the time between the excitation pulses is sufficiently long compared
to the duration of the impulse responses for the system.

In WO 94/25958 it is shown that the envelope of the transient
component in a speech signal is very important for its recognition
and it is shown that the envelope of the impulse response will
contain exponential functions and difference frequencies defined by
the impulse response.

WO 99/48085 PCT/DK99/00128

A method based on damped sine functions to extract important
features from the envelope signal is described, and examples where
the method is used on speech signals show that the features are
important in speech analysis.

Before entering into a more detailed explanation of features of the
method of the invention, a few definitions will be given:

In short time analysis the transient component in a signal is a
matter of definition. For auditory signals, the idea is to obtain
an expression that gives a response corresponding to the response
in the cochlea to an abrupt change in the signal energy. An abrupt
change in the signal energy corresponds to the transient component
in the auditory signal. Thus, in the present context, the term
"transient component" designates any signal corresponding to an
abrupt energy change in an auditory signal. The transient component
holds the signal information to be analysed and in order to analyse
this information the transient component may be transformed to a
corresponding transient pulse having a distinct shape. Thus, in the
present context, the term "transient pulse" refers to a pulse
having a distinct shape and substantially holding the information
of the transient component of the auditory signal and thus
corresponding to an abrupt change in the energy of the auditory
signal. As mentioned above the transient part of a sound signal may
be repeated with time intervals and thus, in the present context,
the term "periodic" when used in combination with a transient
component, response or pulse designates any transient component,
response or pulse being repeated with intervals.

The term "shape" designates any arbitrary time-varying function
(which is time-limited or not time-limited) and which, within a
given time interval $T_p$, has a distinctly different amplitude
level in comparison with the amplitude level outside the interval.
Thus, $T_p$ is the duration of the shape function when the shape
function is time-limited, or the duration of the part of the
function which has a distinctly different amplitude level in
comparison with the amplitude level outside the time interval.

In order to extract information from the shape of the energy
changes, one broad aspect of the invention relates to representing
the shape of the energy changes by the short time Laplace transform
of a transient pulse of the signal. Several methods can be applied
in order to obtain a transient pulse corresponding to the change in
energy, but it is preferred that an envelope detection is used,
where the envelope preferably should be detected from a transient
response of the energy change in the auditory signal.

The energy change representing the distinct sound picture can be a
phoneme or vowel or any other sound which gives a sudden energy
change in an auditory signal.



It is also an aspect of the invention to provide a method for
identifying, in an auditory signal, energy changes which can be
perceived by an animal ear such as a human ear as representing a
distinct sound picture, the method comprising comparing the shape
of energy changes of the signal with predetermined energy change
shapes representing distinct sound pictures. For the identification
it is preferred that the shape of the energy changes is represented
by the shape of a transient pulse of the signal, and it is
furthermore preferred that the shape of the transient pulse should
be obtained by an envelope detection of a transient response of the
energy change in the auditory signal.

The invention also relates to a method for processing a signal so
as to reduce the bandwidth of the signal with substantial retention
of the information of the signal, comprising extracting a transient
part of the signal. The method may further comprise detecting an
envelope of the transient part of the signal.

Known methods of processing signals are based on a short time
Fourier transform of signals, and it is assumed that the signals
are steady state signals.

In steady state analysis the signal is assumed stable in the period
the signal is analysed, and the steady state spectrum is
calculated.

In WO 94/25958 it is disclosed that transient pulses are important
for speech coding and decoding in narrow band communication, for
speech recognition and synthesis, and for sound quality in auditory
products (i.e. loudspeakers, amplifiers and hearing aids).

An important part of a transient signal is the exponential
functions or damping ratios or time constants. The damping ratio is
the reason that the impulse response has a finite duration. The
fact that the transient signal is important for auditory perception
indicates that the response from the hair cells is dependent on the
time constants. If this is the case, it is possible that the
damping ratios in the response from nerve cells in general are
important for the human nerve system.

Transient signals are also important in many other applications,
among others for signals generated by impacts from defects in
rolling bearings and gear-boxes.

Based on the transient signal, it is possible to determine the
natural time constants and frequencies in the system generating the
signal. Further, it is possible to determine the excitation pulses
of the system.

BRIEF DESCRIPTION OF THE DRAWINGS 

Fig. 1 shows a time-domain representation of a linear
time-invariant system,

Fig. 2 shows the impulse response of a Butterworth low-pass
filter of 3rd order and a cut-off frequency at 700 Hz,

Fig. 3 shows the response with the filter relaxed for $t < 0$
and with a 4000 Hz tone as input at $t > 0$,

Fig. 4 shows the s-plane with poles and the zero for
$H(\sigma,\omega)$,

Fig. 5 shows $H(\sigma,\omega)$ for two values of $\omega$,
analysed parallel with the $\sigma$ axis,

Fig. 6 shows transient characteristics in speech signals,

Figs. 7-12 show processed speech signals,

Fig. 13 shows a schematic of a filter bank according to the
present invention.

DETAILED DESCRIPTION OF THE DRAWING 

The importance of the transient part of a signal has been an
overlooked phenomenon in signal analysis.

The response of a linear system to either an impulse or a step
function is defined by its transient response properties.

The relationship between the input and the output for the linear
time-invariant system shown in Fig. 1 can be written as the
convolution of the input signal and the impulse response of the
system:

$$v_o(t) = \int_{-\infty}^{t} v_i(\lambda)\, h(t - \lambda)\, d\lambda \tag{1}$$

If the system is initially relaxed and the input signal $v_i(t)$ is
zero for $t < 0$ then the lower integration limit of Eq. (1) can be
replaced with zero. Eq. (1) then shows the important role played by
the impulse response in terms of the actual signal processing that
is performed by the system. It states that the input signal is
weighted or multiplied by the impulse response at every instant in
time and, at any specific point in time, the output is the
summation or integral of all past weighted inputs.

The impulse response of a real system has a finite duration and the
transient response has the same duration. Fig. 2 shows the impulse
response of a 3rd order Butterworth low-pass filter with a cut-off
frequency at 700 Hz. Fig. 3 shows the response with the filter
relaxed for $t < 0$ and with a 4000 Hz tone as input at $t > 0$.

In many processes $v_i(t)$ will be a pulse with a short duration and
$v_o(t) \approx 0$ before the next pulse will be generated.

The Laplace transform of a signal $v(t)$ is defined by

$$V(s) = \int_{0}^{\infty} v(t)\, e^{-st}\, dt \tag{2}$$
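To make the Laplace transform definition concrete before it is applied to a two-pole impulse response, the sketch below approximates the one-sided Laplace integral numerically and compares it with the standard transform pair for a damped cosine, $(s + \sigma_0)/((s + \sigma_0)^2 + \omega_0^2)$. The helper name `laplace_numeric`, the pole position and the evaluation point $s$ are illustrative assumptions.

```python
import cmath
import math


def laplace_numeric(v, s, t_end, n_steps=100_000):
    """Trapezoidal estimate of the integral of v(t) * exp(-s t) from 0 to t_end."""
    dt = t_end / n_steps
    total = 0.5 * (v(0.0) + v(t_end) * cmath.exp(-s * t_end))
    for i in range(1, n_steps):
        t = i * dt
        total += v(t) * cmath.exp(-s * t)
    return total * dt


# Illustrative damping and frequency of a two-pole impulse response.
sigma0, omega0 = 50.0, 2 * math.pi * 100.0


def damped_cosine(t):
    return math.exp(-sigma0 * t) * math.cos(omega0 * t)


s = complex(20.0, 2 * math.pi * 80.0)   # arbitrary evaluation point
numeric = laplace_numeric(damped_cosine, s, t_end=1.0)
closed_form = (s + sigma0) / ((s + sigma0) ** 2 + omega0 ** 2)
print(abs(numeric - closed_form))       # small residual
```

The integral is truncated at 1 s, which is safe here because the integrand has decayed to a negligible level long before that.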



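If $v(t)$ is the impulse response $h(t)$ for a system with 2 complex
poles,

$$h(t) = e^{-(\sigma_0 + j\omega_0)t} + e^{-(\sigma_0 - j\omega_0)t}, \quad t > 0 \tag{3}$$

and $h(t) = 0$ for $t < 0$, with poles at
$s = -(\sigma_0 \pm j\omega_0)$, the Laplace transform is

$$H(s) = \frac{2(s + \sigma_0)}{(s + \sigma_0 + j\omega_0)(s + \sigma_0 - j\omega_0)}$$

or, with $s = \sigma + j\omega$,

$$H(\sigma,\omega) = \frac{2(\sigma + \sigma_0 + j\omega)}{(\sigma + \sigma_0 + j(\omega + \omega_0))(\sigma + \sigma_0 + j(\omega - \omega_0))} \tag{4}$$

From Eq. (4) it is seen that for
$(\sigma,\omega) \to (-\sigma_0, \pm\omega_0)$,
$H(\sigma,\omega) \to \pm\infty$.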

This is a well-known phenomenon and a logical consequence of this
is as follows:

If the signal analysed is dominated by the impulse response of the
system generating the signal, it is possible to determine the
natural time constants and frequencies for the system.

Fig. 5 shows a plot of $H(\sigma,\omega)$ for two fixed values of
$\omega$.

Analysing a signal along or parallel with the $j\omega$ axis will
give a frequency profile for a given $\sigma$.

Analysing a signal along or parallel with the $\sigma$ axis will
give a time constant profile for a given $j\omega$.

If a signal has a time constant profile with significant variations
for specific frequencies, the signal is transient dominated.
Conversely, if the time constant profile does not vary significantly
for any frequency, the signal is steady state dominated.

A short time Laplace transform is defined by:

$$L(\sigma,\omega,t) = \int_{0}^{t} v_s(t - \lambda)\, e^{-(\sigma + j\omega)\lambda}\, d\lambda \tag{5}$$

in which $v_s$ is the signal, $L$ is the transformed signal,
$\sigma$ is a time constant, and $\omega$ is an angular frequency.

It is not possible to calculate the short time Laplace transform in
the same way as the DFT in the discrete time domain because two
arbitrary exponential functions, $e^{at}$ and $e^{bt}$, are not
orthogonal with respect to each other.
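The lack of orthogonality can be seen numerically. The sketch below contrasts two harmonics of a common period, whose product integrates to zero over that period, with two decaying exponentials, whose product integrates to $1/(a + b)$. Decaying exponentials $e^{-at}$ and $e^{-bt}$ with illustrative rates are assumed here so that the integral converges; the helper name is hypothetical.

```python
import math


def inner_product(f, g, t_end, n_steps=100_000):
    """Midpoint-rule estimate of the integral of f(t) * g(t) over [0, t_end]."""
    dt = t_end / n_steps
    return sum(f((i + 0.5) * dt) * g((i + 0.5) * dt) for i in range(n_steps)) * dt


# Two harmonics of a common 1 s period: orthogonal over that period.
harmonic_product = inner_product(
    lambda t: math.sin(2 * math.pi * 1 * t),
    lambda t: math.sin(2 * math.pi * 2 * t),
    t_end=1.0,
)

# Two decaying exponentials with different (illustrative) rates: not orthogonal.
a, b = 100.0, 300.0
exponential_product = inner_product(
    lambda t: math.exp(-a * t),
    lambda t: math.exp(-b * t),
    t_end=0.2,
)

print(harmonic_product)      # close to zero
print(exponential_product)   # close to 1 / (a + b)
```

Because the exponentials overlap for every pair of rates, a transform coefficient cannot be isolated by a simple projection as in the DFT, which motivates the filter bank approach developed next.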

The short time Fourier analysis in the analogue time domain is 
based on a filter bank method. In this paper an equivalent method 
will be developed for the Laplace transform. 



From Eq. (1) and Eq. (3):

$$v_o(t) = \int_{0}^{t} v_i(t - \lambda)\, e^{-(\sigma + j\omega)\lambda}\, d\lambda + \int_{0}^{t} v_i(t - \lambda)\, e^{-(\sigma - j\omega)\lambda}\, d\lambda \tag{6}$$

$$v_o(t) = L(\sigma,\omega,t) + L^*(\sigma,\omega,t) = u(t) + u^*(t)$$

where $u^*(t)$ is the complex conjugate of $u(t)$ and we have

$$\mathrm{Re}[L(\sigma,\omega,t)] = \tfrac{1}{2} v_o(t) \tag{7}$$

From Eq. (6) and Eq. (7) it is seen that filtering the signal
$v_i(t)$ by a filter with the impulse response $h(\sigma,\omega,t)$
with 2 complex poles will represent the real part of the short time
Laplace transform $L(\sigma,\omega,t)$.

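If we let $v_i(t)$ be equal to the impulse response of a single pole
we have

$$L(\sigma,\omega,t) = \int_{0}^{t} k\, e^{-(\sigma_0 + j\omega_0)(t - \lambda)}\, e^{-(\sigma + j\omega)\lambda}\, d\lambda = k\, \frac{e^{-(\sigma_0 + j\omega_0)t} - e^{-(\sigma + j\omega)t}}{(\sigma - \sigma_0) + j(\omega - \omega_0)} \tag{8}$$

and from Eq. (7) we have

$$v_o(t) = \frac{2k(\sigma - \sigma_0)\left(e^{-\sigma_0 t}\cos(\omega_0 t) - e^{-\sigma t}\cos(\omega t)\right)}{(\sigma - \sigma_0)^2 + (\omega - \omega_0)^2} + \frac{2k(\omega - \omega_0)\left(e^{-\sigma t}\sin(\omega t) - e^{-\sigma_0 t}\sin(\omega_0 t)\right)}{(\sigma - \sigma_0)^2 + (\omega - \omega_0)^2}$$

or

$$\frac{v_o(t)}{2k} = \frac{e^{-\sigma_0 t}\left((\sigma - \sigma_0)\cos(\omega_0 t) - (\omega - \omega_0)\sin(\omega_0 t)\right)}{(\sigma - \sigma_0)^2 + (\omega - \omega_0)^2} \tag{9a}$$

$$\qquad - \frac{e^{-\sigma t}\left((\sigma - \sigma_0)\cos(\omega t) - (\omega - \omega_0)\sin(\omega t)\right)}{(\sigma - \sigma_0)^2 + (\omega - \omega_0)^2} \tag{9b}$$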

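Eq. (9) is not defined for $(\sigma,\omega) = (\sigma_0,\omega_0)$
but from Eq. (8) we have in this case

$$L(\sigma_0,\omega_0,t) = \int_{0}^{t} k\, e^{-(\sigma_0 + j\omega_0)t}\, d\lambda = k\,t\, e^{-(\sigma_0 + j\omega_0)t}$$

and

$$v_o(t) = 2kt\, e^{-\sigma_0 t}\cos(\omega_0 t) \tag{10}$$

and we have $v_o(t) \to 0$ for $t \to \infty$.

Eq. (9) shows that the gain is inversely related to
$\sigma - \sigma_0$ and $\omega - \omega_0$, and when
$(\sigma_0,\omega_0)$ is far from $(\sigma,\omega)$ and
$e^{-\sigma t}$ is small, $v_o(t) \approx 0$. For
$(\sigma,\omega) \to (\sigma_0,\omega_0)$, $v_o(t)$ will have Eq. (10)
as the limit. It is not immediately apparent whether Eq. (9) has the
maximum energy for $(\sigma,\omega) \to (\sigma_0,\omega_0)$.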

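In the DC domain Eq. (9) can be written as

$$v_o(t) = 2k\, \frac{e^{-\sigma_0 t} - e^{-\sigma t}}{\sigma - \sigma_0} \tag{11}$$

The maximum for $v_o(t)$ can be found as follows

$$\frac{\partial v_o}{\partial t} = \frac{2k}{\sigma - \sigma_0}\left[\sigma e^{-\sigma t} - \sigma_0 e^{-\sigma_0 t}\right] = 0$$

when

$$t_m = \frac{\log(\sigma) - \log(\sigma_0)}{\sigma - \sigma_0} \tag{12}$$

and Eq. (11) will have the maximum for this value.
It can be shown that $t_m \to 1/\sigma_0$ when $\sigma \to \sigma_0$.

When $\sigma \approx \sigma_0$ we will have the approximated maximum
with $t = 1/\sigma_0$:

$$v_o(t_m) = 2k\, \frac{e^{-1} - e^{-\sigma/\sigma_0}}{\sigma - \sigma_0} \tag{13}$$

From Eq. (13) it can be shown that

$$v_o \to \frac{2k\, e^{-1}}{\sigma_0} \quad \text{for } \sigma \to \sigma_0$$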
20 

In Eq. (11) $e^{-\sigma_0 t}$ represents the signal to be analysed
and $e^{-\sigma t}$ the filter. Table 1 shows the result with a
filter having $\sigma = 100\ \mathrm{s}^{-1}$ and the signal varying
from 1 to 10000 $\mathrm{s}^{-1}$.

It is not surprising that the convolution acts as a low-pass
filter. The important fact is that the exponential function in the
DC domain in some way acts as frequencies do in the frequency
domain.

In Table 1 $v_{od}(t_m)$ is the result of a convolution where the
signal is differentiated. The result is, as expected, a high-pass
filter.

If we look at Eq. (9a) without exponential functions it can be
written as

$$v_o(t) = \frac{2k(\sin(\omega t) - \sin(\omega_0 t))}{\omega - \omega_0} \tag{14}$$

and it is seen that for $\omega \to \infty$ we will have $v_o \to 0$.



$\sigma = 100\ \mathrm{s}^{-1}$

$\sigma_0$ (s^-1)    $t_m$ (s)       $v_o(t_m)$      $v_{od}(t_m)$
1                    0,046516871     0,954548457     0,009545485
10                   0,025584279     0,774263683     0,077426368
100                  0,010000000     0,367879441     0,367879441
1000                 0,002558428     0,077426368     0,774263683
10000                0,000465169     0,009545485     0,954548457

Table 1. $v_o(t_m)$ is given by Eq. (11, 12) and normalised by
$\sigma$ and $2k$. $v_{od}(t_m)$ is a convolution where the signal is
differentiated and normalised by $2k$.
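The entries of Table 1 can be reproduced directly from Eq. (11) and Eq. (12). The sketch below is a check under stated assumptions: the function names are hypothetical, $2k$ is set to 1, the low-pass column is scaled by the filter value $\sigma$, and the differentiated-signal column is obtained by scaling with $\sigma_0$ (the magnitude of the derivative of $e^{-\sigma_0 t}$), with the sign dropped.

```python
import math


def t_max(sigma, sigma0):
    """Eq. (12): time of the maximum; the limit 1/sigma0 is used when sigma == sigma0."""
    if sigma == sigma0:
        return 1.0 / sigma0
    return (math.log(sigma) - math.log(sigma0)) / (sigma - sigma0)


def v_dc(t, sigma, sigma0, k=0.5):
    """Eq. (11): 2k (exp(-sigma0 t) - exp(-sigma t)) / (sigma - sigma0)."""
    if sigma == sigma0:
        return 2 * k * t * math.exp(-sigma0 * t)   # limiting form for equal rates
    return 2 * k * (math.exp(-sigma0 * t) - math.exp(-sigma * t)) / (sigma - sigma0)


sigma = 100.0   # filter, s^-1
for sigma0 in (1.0, 10.0, 100.0, 1000.0, 10000.0):
    tm = t_max(sigma, sigma0)
    low_pass = sigma * v_dc(tm, sigma, sigma0)     # normalised by sigma (2k = 1)
    high_pass = sigma0 * v_dc(tm, sigma, sigma0)   # differentiated signal, sign dropped
    print(f"{sigma0:8.0f} {tm:.9f} {low_pass:.9f} {high_pass:.9f}")
```

The two output columns mirror each other around $\sigma_0 = \sigma$, which is the low-pass and high-pass behaviour discussed in the text.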

For $\omega \ll \omega_0$ we will have

$$v_o(t) \approx -\frac{2k(\sin(\omega t) - \sin(\omega_0 t))}{\omega_0} \tag{15}$$

It can be shown that for $\omega \to \omega_0$ we will have

$$v_o(t) \to 2kt\cos(\omega_0 t) \tag{16}$$

This result is, as expected, unstable.

In transient analysis only the beginning of the signal is of
interest, and if $\omega_0 \gg 1$ Eq. (14) will act as a band-pass
filter.
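The limit stated in Eq. (16) can be checked numerically by evaluating the undamped expression of Eq. (14), $2k(\sin(\omega t) - \sin(\omega_0 t))/(\omega - \omega_0)$, for a filter frequency approaching the signal frequency. The function names, the signal frequency and the observation time below are illustrative assumptions, with $2k = 1$.

```python
import math


def v_undamped(t, omega, omega0, k=0.5):
    """Eq. (14): 2k (sin(omega t) - sin(omega0 t)) / (omega - omega0)."""
    return 2 * k * (math.sin(omega * t) - math.sin(omega0 * t)) / (omega - omega0)


def v_limit(t, omega0, k=0.5):
    """Eq. (16): limiting form 2k t cos(omega0 t) for omega -> omega0."""
    return 2 * k * t * math.cos(omega0 * t)


omega0 = 2 * math.pi * 1000.0   # illustrative signal frequency, rad/s
t = 0.0123                      # illustrative observation time, s
for detune in (100.0, 1.0, 0.01):
    print(detune, v_undamped(t, omega0 + detune, omega0), v_limit(t, omega0))
```

As the detuning shrinks, the first value approaches the second. The factor $t$ in the limit grows without bound, which is why the undamped result is described as unstable.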

Speech processing is based on fast energy pulses generated by the
vocal cords or by friction in the articulation channel weighted by
the impulse response in the articulation channel. The rise time for
the excitation pulses has to be sufficiently faster than the rise
time of the energy of the impulse response.

The shapes of energy pulses are important features in speech. If the
time between the pulses is periodical it is voiced speech, and if
not it is unvoiced speech. For some phonemes abrupt changes in the
energy pulses are important.

From WO 94/25958 it is known that the shape of the energy pulses
is important for speech recognition, especially the leading edge.
In the following a method to extract features will be developed
based on an envelope detection.

The convolution expressed in Eq. (9) can be regarded as a response
from 2 poles in the articulation channel excited by an impulse. If
$\sigma_0 \approx \sigma$ we have from Eq. (9a)

$$v_o(t) = \frac{2k\, e^{-\sigma t}}{\omega - \omega_0}\left(\sin(\omega t) - \sin(\omega_0 t)\right) \tag{17}$$

The envelope is defined as

$$e(t) = \sqrt{v^2(t) + \hat{v}^2(t)}$$

where

$$\hat{v}(t) = v(t) * \frac{1}{\pi t}$$

is the Hilbert Transform.

The envelope of Eq. (17) is then

$$e_o(t) = \frac{2k\, e^{-\sigma t}}{|\omega - \omega_0|}\sqrt{(\sin(\omega t) - \sin(\omega_0 t))^2 + (-\cos(\omega t) + \cos(\omega_0 t))^2}$$

$$= \frac{2k\, e^{-\sigma t}}{|\omega - \omega_0|}\sqrt{2(1 - \cos((\omega - \omega_0)t))}$$

$$\approx \frac{2\sqrt{2}\,k\, e^{-\sigma t}}{|\omega - \omega_0|}\left(1 - \tfrac{1}{2}\cos((\omega - \omega_0)t)\right) \tag{18}$$

The approximation is legal because
$|\cos((\omega - \omega_0)t)| \le 1$.

As expected the envelope has a component with the difference
frequency of the 2 frequencies.
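The step from the two-tone expression to the difference frequency rests on the identity $(\sin a - \sin b)^2 + (\cos b - \cos a)^2 = 2(1 - \cos(a - b))$. The sketch below evaluates both sides for illustrative frequencies. As in the text, the damping factor is treated as slowly varying, so the Hilbert transform is taken to map each sine to a negative cosine; the function names and parameter values are assumptions.

```python
import math


def envelope_two_tones(t, omega, omega0, sigma, k=0.5):
    """Envelope from the in-phase and quadrature parts of the two-tone response."""
    gain = 2 * k * math.exp(-sigma * t) / abs(omega - omega0)
    in_phase = math.sin(omega * t) - math.sin(omega0 * t)
    quadrature = -math.cos(omega * t) + math.cos(omega0 * t)
    return gain * math.hypot(in_phase, quadrature)


def envelope_difference_form(t, omega, omega0, sigma, k=0.5):
    """Same envelope written with the difference frequency only."""
    gain = 2 * k * math.exp(-sigma * t) / abs(omega - omega0)
    return gain * math.sqrt(2 * (1 - math.cos((omega - omega0) * t)))


omega = 2 * math.pi * 2500.0    # illustrative filter frequency, rad/s
omega0 = 2 * math.pi * 2100.0   # illustrative signal frequency, rad/s
sigma = 300.0                   # illustrative damping, s^-1
for t in (0.0005, 0.001, 0.002):
    print(t, envelope_two_tones(t, omega, omega0, sigma),
          envelope_difference_form(t, omega, omega0, sigma))
```

Only the 400 Hz difference between the two tones survives in the envelope; the individual frequencies no longer appear.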

The conclusion is that we can expect to find damped difference
frequencies in the envelope of the transient component.

To detect the damped difference frequencies a filter bank is used.
The features might be detected as a convolution between the
transient pulse and the impulse response of the filters.

In general form the impulse response can be written as

$$h(t) = k\, e^{-\lambda t}\sin(f(\lambda)t + \phi)$$

where $\sigma = \lambda$ and $\omega = f(\lambda)$.

In the following analysis $f(\lambda) = 1.5\lambda$,
$k = \omega = 1.5\lambda$, and $\phi = 0$ are selected and we have

$$h(t) = 1.5\lambda\, e^{-\lambda t}\sin(1.5\lambda t) \tag{19}$$

By selecting $\omega = 1.5\sigma$ Eq. (19) will act as a band-pass
filter with a low Q in relation to the frequencies. Other ratios
$\omega/\sigma$ than 1.5 may be selected and it is presently
preferred that the ratio ($\omega/\sigma$) ranges from 0.5 to 2.5.
The exponential function gives the advantage that it acts like a
natural time window that ensures that the signal is naturally
damped. The values of the parameters are selected by studying rise
times in important transient pulses and by experiments.
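The low-Q filter of Eq. (19), $h(t) = 1.5\lambda\, e^{-\lambda t}\sin(1.5\lambda t)$, can be evaluated directly to see how quickly it responds. The sketch below computes the impulse response and the time of its first maximum, which follows from setting the derivative to zero: $\tan(\omega t) = \omega/\lambda$. The function names are hypothetical and the value of $\lambda$ is chosen only for illustration.

```python
import math


def low_q_impulse_response(t, lam, ratio=1.5):
    """Eq. (19) with omega = ratio * lam; zero for t < 0."""
    if t < 0:
        return 0.0
    omega = ratio * lam
    return omega * math.exp(-lam * t) * math.sin(omega * t)


def peak_time(lam, ratio=1.5):
    """First maximum of h(t): dh/dt = 0 gives tan(omega t) = omega / lam = ratio."""
    return math.atan(ratio) / (ratio * lam)


lam = 1500.0   # s^-1, illustrative filter parameter
t_peak = peak_time(lam)
print(t_peak, low_q_impulse_response(t_peak, lam))
```

A larger $\lambda$ shortens both the rise to the peak and the decay after it, so a single parameter sets the filter's time window as well as its centre frequency.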

25 

Fig. 6 shows transient characteristics in speech signals. The top 
figure shows 50 ms of an u a" in "hard key" pronounced by a female. 

The second signal is a band-pass filtration of the speech signal. 
30 The band-pass filter is a Butterworth filter with 6 poles and a 
band width from 2150 to 3550 Hz. This frequency band contains 
important transient pulses in the sensitive frequency interval of 
the ear . 
The third signal is an energy detection of the transient
characteristics of the band-pass filtered speech signal. The
detection is an envelope detection performed by means of a
rectification and a low-pass filtration of the signal. The filter
is a Butterworth filter with 3 poles and a cut-off frequency at 700
Hz.
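Envelope detection by rectification followed by low-pass filtration
may be sketched as follows. As a simplification, a one-pole recursive
low-pass filter stands in for the 3-pole Butterworth filter of the
description. The sampling frequency and the test burst are
hypothetical.

```python
import math

def rectify(signal):
    """Full-wave rectification."""
    return [abs(x) for x in signal]

def one_pole_lowpass(signal, cutoff_hz, fs):
    """One-pole recursive low-pass (assumed stand-in, not a Butterworth filter)."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / fs)
    y, out = 0.0, []
    for x in signal:
        y += alpha * (x - y)  # move the state a fraction alpha towards the input
        out.append(y)
    return out

def envelope(signal, cutoff_hz=700.0, fs=48000.0):
    """Envelope = low-pass filtered rectified signal."""
    return one_pole_lowpass(rectify(signal), cutoff_hz, fs)

# Hypothetical test burst: a decaying 2850 Hz tone inside the 2150-3550 Hz band.
fs = 48000.0
burst = [math.exp(-p / 480.0) * math.sin(2.0 * math.pi * 2850.0 * p / fs)
         for p in range(2400)]
env = envelope(burst, fs=fs)
```

The resulting envelope follows the decay of the burst while the carrier
oscillation is largely suppressed.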

In WO 97/09712 a method for automatically detecting the leading
edges is disclosed. The method uses the maximum slope of the
leading edge as reference, and the leading edge is defined to begin
at the point before the maximum slope where the slope is less than
a given threshold (10-20 % of the maximum slope).
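The slope criterion described above may be sketched as follows. This
is an illustration under stated assumptions and not the implementation
of WO 97/09712: the slope is approximated by the first difference of
the envelope, the threshold is set to 15 % of the maximum slope, and
the function name is hypothetical.

```python
def leading_edge_start(envelope, threshold=0.15):
    """Return the sample index at which the leading edge is taken to begin.

    The search starts at the point of maximum slope and walks backwards
    until the slope of the preceding interval falls below
    threshold * maximum slope.
    """
    slopes = [envelope[i + 1] - envelope[i] for i in range(len(envelope) - 1)]
    i_max = max(range(len(slopes)), key=lambda i: slopes[i])
    limit = threshold * slopes[i_max]
    i = i_max
    while i > 0 and slopes[i - 1] >= limit:
        i -= 1
    return i

# Hypothetical envelope samples with a slow onset followed by a steep rise.
example = [0.0, 0.0, 0.01, 0.02, 0.3, 0.9, 1.0, 1.0]
start = leading_edge_start(example)
```

In this example the maximum slope lies between samples 4 and 5, and
the edge is found to begin at sample 3, where the slope first exceeds
the threshold.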

The transient (envelope) signal in Fig. 6 has a DC component,
which does not contain any information. Therefore it is preferred
that the signal is differentiated before it is analysed, e.g. by the
filter bank shown in Fig. 13.

In Fig. 13, the filters ($h_1(t), h_2(t), \ldots, h_n(t)$) in the
filter bank connected between the input and the envelope detectors
are band-pass filters having bandwidths corresponding to the
bandwidths of the band-pass filters of the cochlea and having centre
frequencies ranging from 1400 Hz to 6500 Hz.

The output signals $O_{i,m}(p)$ from the filter bank shown in Fig. 13
are calculated by:

\[ h_m(p) = 1.5\lambda_m\,e^{-\lambda_m p}\sin(1.5\lambda_m p) \]

\[ t'_i(p) = 0 \quad \text{for } p < 0 \]

\[ O_{i,m}(p) = \sum_{k=0}^{P-1} h_m(k)\,t'_i(p-k) \]

where $i = 0, 1, \ldots, N-1$; $m = 0, 1, \ldots, M-1$ and $M$ is the
number of band-pass filters with a low Q in the filter bank connected
between the outputs and the envelope detectors; $p = 0, 1, \ldots, P-1$
is the sample number; $t'$ is the differentiated transient signal; and
$\lambda_m$ is the filter bank parameter, which is normalised by the
sampling frequency.

In the analysis $M$ is selected to be 10 and
$1500 < \lambda'_m < 12000\ \mathrm{s^{-1}}$, where $\lambda'_m$ is
not normalised. By this we have
$1885 < \omega_m < 18850\ \mathrm{s^{-1}}$ or $300 < f_m < 3000$ Hz.
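The processing by the low-Q filter bank may be sketched as follows:
the envelope is differentiated to remove its DC component, and the
result is convolved with each of the M = 10 damped sinusoidal impulse
responses. The logarithmic spacing of the filter parameters, their
end values, the sampling frequency and the test envelope are
assumptions made for illustration only.

```python
import math

def differentiate(x):
    """First difference; a constant (DC) offset in x does not affect the result."""
    return [0.0] + [x[p] - x[p - 1] for p in range(1, len(x))]

def low_q_response(lam_norm, length, ratio=1.5):
    """h(p) = 1.5*lam*exp(-lam*p)*sin(1.5*lam*p), lam normalised by fs."""
    w = ratio * lam_norm
    return [w * math.exp(-lam_norm * p) * math.sin(w * p) for p in range(length)]

def convolve(x, h):
    """Causal convolution with x(p) = 0 for p < 0; returns len(x) samples."""
    out = []
    for p in range(len(x)):
        acc = 0.0
        for k in range(min(p + 1, len(h))):
            acc += h[k] * x[p - k]
        out.append(acc)
    return out

def filter_bank(envelope, lambdas_per_s, fs):
    """One output signal per filter parameter (given in 1/s, not normalised)."""
    t_diff = differentiate(envelope)
    return [convolve(t_diff, low_q_response(lam / fs, len(t_diff)))
            for lam in lambdas_per_s]

def log_spaced(lo, hi, m):
    """m logarithmically spaced values from lo to hi (assumed spacing)."""
    return [lo * (hi / lo) ** (i / (m - 1)) for i in range(m)]

fs = 48000.0                                 # assumed sampling frequency
lambdas = log_spaced(1600.0, 11500.0, 10)    # hypothetical values inside the stated range
env = [1.0 - math.exp(-p / 40.0) for p in range(200)]  # hypothetical rising envelope
outputs = filter_bank(env, lambdas, fs)
maxima = [max(o) for o in outputs]           # one maximum per filter
```

The list of maxima gives one value per filter parameter and may be
plotted against that parameter.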

This filtering process is not done in the cochlea but in the hair 
cells or in the nerve system behind the hair cells. 


Figs. 7, 8, 9, 10, 11, and 12 show the output of the
processing of transient signals in the vowels "a", "o", "i" in
"hard key" and "soft key" pronounced by a female and a male.
Further the figures show plots of maxima of the output signals as a
function of the time constant of the corresponding filter.

The figures show that the maximum curves are very much alike for the
same vowel, independent of whether a female or a male pronounces it.

With a library of templates and a distance measure it is possible
to identify the sound picture, and it can be used for speech
recognition and narrow band communication.
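Identification against a library of templates may be sketched as
follows. The description does not specify the distance measure; a
Euclidean distance is assumed here, and the template values are
invented placeholders rather than measured data.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equally long curves (assumed measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def identify(maxima, templates):
    """Return the name of the template closest to the measured maximum curve."""
    return min(templates, key=lambda name: euclidean(maxima, templates[name]))

# Hypothetical library: one maximum curve per vowel (placeholder numbers).
templates = {
    "a": [0.9, 0.5, 0.2],
    "o": [0.4, 0.8, 0.3],
    "i": [0.2, 0.3, 0.9],
}
best = identify([0.85, 0.55, 0.25], templates)
```

In practice each template would hold one value per filter in the
filter bank.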

Thus, according to the invention a method and an apparatus are
provided for determination of a parameter of a system generating a
signal containing information about the parameter, in which the
signal is short time transformed substantially in accordance with

in which v 4 is the signal, $L$ is the transformed signal, $\sigma$
is a time constant, $\omega$ is an angular frequency, and $\varphi$
is a phase, or in accordance with another transformation which will
give rise to an $L'(\sigma, \omega, t)$ which, in time intervals
within which $L(\sigma, \omega, t)$ is larger than 10% of its maximum
value, is not more than 50% different from the result given by the
short time Laplace transformation.

In narrow band communication the transient pulses have to be
identified and coded, and the decoder will contain a library of
filters with corresponding transient responses. The decoder library
could also contain the transient responses.

The present invention also relates to measurement of mechanical
vibrations, e.g. when testing devices that generate mechanical
energy during operation, such as mechanical devices with moving
parts, such as compressors for refrigerators, electric motors,
household machines, electric razors, combustion engines, etc.

For example, it is known that measurement of vibration generated or
sound emitted by a device during operation can be useful for
detection of malfunction of the device. Certain failures may
generate sound or vibration of specific characteristics that can be
recognised.


The method may also comprise steps of classification for
classifying a tested device in accordance with the determined
parameters into one class of a set of predefined classes. Each
predefined class may be defined by a set of upper and lower limits
for specific parameters determined according to the method. A
device may then be classified as belonging to a certain class if
its corresponding parameter values lie within corresponding upper
and lower limits of the class.

Each class may correspond to a specific type of failure of the
device. For example, shaft imbalance, wheel imbalance, crookedness,
imperfections of teeth in cogs, tight bearings, loose bearings, etc.,
may cause the device to vibrate in different characteristic ways,
whereby a characteristic mechanical vibration or sound is generated
for each type of failure. The type of failure of the device may
then be detected by comparing determined device parameters with
corresponding parameter values of various predetermined classes.



The upper and lower limits of a specific class of devices may be
determined by testing a set of devices known to belong to that
class. For example, the upper limits may be determined as the
average of specific parameter values plus three times the standard
deviation. Likewise, the lower limits may be determined as the
average of parameter values minus three times the standard
deviation.
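The determination of class limits and the subsequent classification
may be sketched as follows. The sample standard deviation is assumed
(the description does not state which estimator is used), and the
parameter values are invented placeholders.

```python
import statistics

def class_limits(training, k=3.0):
    """Limits (mean - k*std, mean + k*std) for each parameter.

    training: list of parameter vectors measured on devices known to
    belong to the class.
    """
    limits = []
    for values in zip(*training):
        mean = statistics.mean(values)
        std = statistics.stdev(values)  # sample standard deviation (assumption)
        limits.append((mean - k * std, mean + k * std))
    return limits

def belongs(params, limits):
    """True if every parameter lies within its lower and upper limit."""
    return all(lo <= p <= hi for p, (lo, hi) in zip(params, limits))

def classify(params, classes):
    """Names of all classes whose limits contain the parameter vector."""
    return [name for name, limits in classes.items() if belongs(params, limits)]

# Hypothetical training data: two parameters per tested device.
classes = {
    "tight bearing": class_limits([[1.0, 10.0], [1.2, 11.0], [0.8, 9.0]]),
    "shaft imbalance": class_limits([[3.0, 10.0], [3.2, 11.0], [2.8, 9.0]]),
}
result = classify([3.1, 9.5], classes)
```

A device whose parameters fall outside the limits of every class
yields an empty list, which may be treated as an unknown condition.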






